perm filename FOO.PUB[LSP,JRA]12 blob
sn#350967 filedate 1978-04-22 generic text, type T, neo UTF8
.SS(Symbolic Expressions: Abstract Data Structures)
.begin "sexp"
.FP
We wish to show that the
use of abstraction will benefit the study of data structures and LISP.
To begin our study we should therefore characterize the domain
of LISP data structures in a manner similar to what we did for numbers.
Our objects are called %2Symbolic Expressions%*.
Our domain of Symbolic Expressions is named %d<sexpr>%*.
⊗→Symbolic expressions↔← are also known as %2⊗→S-expressions↔←%*
or %2⊗→S-exprs↔←%*.
The set of symbolic expressions is defined inductively over a base set
named %d<atom>%*.
The set %d<atom>%* can itself be defined inductively. We give a
set of BNF equations
for elements of <atom> below, but the essential character of the
domain is that it represents two kinds of objects: the %2⊗→literal atoms↔←%*
and the integers.
The elements of %d<atom>%* are called %2atoms%*.
.BEGIN TABIT1(19);GROUP;
.P115:
.BOXA
.KRK
<atom>\:: = <literal atom> | <numeral> | -<numeral>
.PT2
<literal atom>\:: = <atom letter>
\:: = <literal atom><atom letter>
\:: = <literal atom><digit>
.PT2;
<numeral>\:: = <digit> | <numeral><digit>
.PT2
<atom letter>\:: =%3 A | B | C ... | Z%* ⊗↓We use ellipses here as a convenient abbreviation.←
.PT2
<digit>\:: = %30 | 1 | 2 ... | 9%*
.BOXB
.END;
.FP
A <literal atom> is therefore a string of uppercase letters and digits, subject
to the provision that the %6first%* character in the atom be a letter.
.BEGIN TABIT2(20,35);GROUP;
.BOXA
.KRK
For example:\atoms\non atoms
.PT18;
%3\ABC123\2A
.PT2
\12\a
.PT2
\A4D6\$$g
.PT2
\NIL\ABD.
.PT2
\T\(A . B)
%*
.BOXB
.END;
The characteristics of atoms which most interest us are their distinguishability:
the atom %3ABC%* is distinguishable from the atom %3AB%*. That
%3"AB"%* is a
part of %3"ABC"%* is not germane to our current discussion.⊗↓However, we
will discuss such topics in {yonss(P27)} on string processing.←
Similarly, we will seldom need to exploit numerical relationships underlying
the numerals; at most we will use simple counting properties.
Therefore most of our discussions will deal with non-numeric atoms.
Most implementations
of LISP do however contain a large arithmetic entourage.
Many implementations also
give a wider class of literal atoms, allowing some special characters to appear;
for most of our discussion the above class is quite sufficient.
.GROUP;
The domain of Symbolic expressions, called %d<sexpr>%* is defined inductively over
the domain %d<atom>%*.⊗↓We will not give the termination clause,
but it is assumed to hold.←
.PT24;
.NL;
%21.%*##Any element of %d<atom>%* is an element of %d<sexpr>%*.
.NL;
%22.%*##If %9α%41%1 and %9α%42%1 are elements of %d<sexpr>%*,
then the pair of %9α%41%1 and %9α%42%1 is in %d<sexpr>%1. Pairs
are also called
%2⊗→dotted-pairs↔←%* since their standard
representation in LISP is %3(%9α%41%3.%9α%42%3)%1.
.PT24
.APART
.FP;
Thus %d<sexpr>%* includes %d<atom>%* as a proper subset.
The notation we chose for the dotted-pairs is the following:
.BEGIN INDENT 10,10,10;
.BOXA
A dotted-pair consists of a left-parenthesis followed by an
S-expr, followed by a period, followed by an S-expr,
followed by a right-parenthesis.
.BOXB
.END
.FP
For example, let %9α%41%1 be %3(A#.#B)%1 and %9α%42%1 be %3(1#.#T)%1;
then %3(%9α%41%1#.#%9α%42%1)%1 is %3((A#.#B)#.#(1#.#T))%1.
Greek letters %9α%* and %9β%* will be used in the text
to designate pattern matches.
In the current context the pattern matches will involve
S-expressions; they can match any well-formed S-expression.
For a further example,
let %3(A#.#(B#.#C))%* be %3(%9α%*#.#%9β%3)%1 then
%9α%1 is %3A%* and %9β%* is %3(B#.#C)%1.
These variables are called %2⊗→match-variable↔←s%* or %2⊗→meta-variables↔←%1.
Finally here's a BNF description of the full set of S-expressions.
.BEGIN TURN ON "←";
.BOXA
.P47:
.KRK
←<sexpr> :: = <atom> | %3(%*<sexpr> . <sexpr>%3)
.BOXB
.END;
Notice that if we allow real numbers as atoms then some care would need to be
exercised when writing S-expressions.
For example, should %3(3.1.2)%* be interpreted
as the dotted pair %3(3 . 1.2)%*,
as the dotted pair %3(3.1#.#2)%*,
or is it just an ill-formed expression?
Interpretation of such ambiguous constructs will depend on the implementation; such
details are discussed later.